Skip to content

fix(format): pass row body size, not full payload size, to BinaryRow.pointTo#3715

Merged
chaokunyang merged 3 commits into
apache:mainfrom
stevenschlansker:fory-docs-encoders-cleanup
May 31, 2026
Merged

fix(format): pass row body size, not full payload size, to BinaryRow.pointTo#3715
chaokunyang merged 3 commits into
apache:mainfrom
stevenschlansker:fory-docs-encoders-cleanup

Conversation

@stevenschlansker
Copy link
Copy Markdown
Contributor

@stevenschlansker stevenschlansker commented May 29, 2026

Why?

Minor fixes

  • Latent bug in BinaryRowEncoder exposes 8 bytes beyond end of intended buffer to BinaryRow
  • Add reminder documentation for byte[] encoder forms why embedded size is ignored. Tripped up both me and Claude
  • Clean up docstring. Identified during AI review of forthcoming feature branches

AI Contribution Checklist

  • Substantial AI assistance was used in this PR: yes / no

AI Usage Disclosure

  • substantial_ai_assistance: yes
  • scope: all
  • affected_files_or_subsystems: Java row format
  • ai_review: ✔️
  • ai_review_artifacts:

● Done. Three commits amended on fory-docs-encoders-cleanup: tightened comments, tightened test, tightened commit messages, no functional changes since the previous fix was already correct. Force-push when you're ready.

  • human_verification: ✔️
  • performance_verification: N/A
  • provenance_license_confirmation: ✔️

Does this PR introduce any user-facing change?

No

@stevenschlansker stevenschlansker force-pushed the fory-docs-encoders-cleanup branch from 370d762 to f4facff Compare May 29, 2026 18:33
@stevenschlansker
Copy link
Copy Markdown
Contributor Author

I'm not understanding the CI failure. The license header check fails with no output.

* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stevenschlansker the ci is failing because it expects an empty line here

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting, running bash ci/format.sh --java per docs does not fix it.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes that is interesting, actually the check follows GoogleJavaFormat and it doesn't accounts for that empty line, need to move license header to that file which enforces this.

@stevenschlansker stevenschlansker force-pushed the fory-docs-encoders-cleanup branch from f4facff to 0b25586 Compare May 31, 2026 03:14
Claude (on behalf of Steven Schlansker) added 3 commits May 30, 2026 20:17
…pointTo

BinaryRowEncoder.decode consumes an 8-byte schema hash, then pointed the row at
the rest of the buffer with the full payload size. The row's sizeInBytes was
therefore 8 too large, so copy(), toBytes(), and getSizeInBytes() on that row
would include 8 trailing bytes of unrelated buffer content.

The bug is latent: decode does not expose the row to callers, and the reader
index was already advanced correctly by size - 8. The regression test substitutes
a capturing GeneratedRowEncoder and asserts BinaryRow.getSizeInBytes() equals the
payload minus the schema hash.
The class javadoc carried a trailing "<p>, ganrunsheng" line — a truncated
attribution. Keep only the one-sentence description.
The MemoryBuffer overloads respect sizeEmbedded for stream framing; the byte[]
overloads do not, because encode(T) writes no size prefix and decode(byte[]) uses
bytes.length. Comment the three byte[] decode methods so the asymmetry isn't
mistaken for a bug.
@stevenschlansker stevenschlansker force-pushed the fory-docs-encoders-cleanup branch from 0b25586 to 260508c Compare May 31, 2026 03:17
@chaokunyang chaokunyang merged commit 598e82f into apache:main May 31, 2026
56 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants